Aggregation-Based Learning in the Inverted Pendulum Problem

Authors

  • Gerald van den Berg
  • Warren Powell
Abstract

We consider the problem of adapting approximate dynamic programming techniques to the inverted pendulum task. This is a particularly challenging task as we work with a relatively uninformative reinforcement signal and have no a priori information about our system. Success in this task requires an effective solution to the credit assignment problem, incorporation of noisy and biased information into our belief about the system, and efficient learning. We use an aggregation-based Bayesian prior to estimate our value function and explore the performance of the knowledge gradient algorithm relative to other policies. We deal with the credit assignment problem through the use of a decaying trace of the reinforcement signal. Although this updating mechanism violates some assumptions of traditional learning models, we find that the knowledge gradient policy is effective in improving performance.
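
The decaying trace mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `decayed_credit`, the `decay` parameter, and the integer cell indices are all assumptions made for illustration. The sketch spreads a single reinforcement signal (for example, a failure signal when the pendulum falls) backward over the aggregated states visited in an episode, giving the most recent state the full signal and each earlier step geometrically less.

```python
def decayed_credit(visited_states, signal, decay=0.9):
    """Spread one reinforcement signal back over the visited states.

    The most recent state receives the full signal; each earlier step is
    discounted by one more factor of `decay` (a hypothetical parameter).
    A state visited several times accumulates credit from every visit.
    """
    credit = {}
    weight = 1.0
    for state in reversed(visited_states):
        credit[state] = credit.get(state, 0.0) + weight * signal
        weight *= decay
    return credit


# Hypothetical episode: indices of aggregated (angle, velocity) cells,
# ending with a failure signal of -1 when the pendulum falls.
episode = [3, 4, 4, 5]
print(decayed_credit(episode, signal=-1.0, decay=0.5))
```

Under these assumptions, states visited just before the failure absorb most of the blame, which is one simple way to approach the credit assignment problem when the reinforcement signal arrives only at the end of an episode.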

Similar resources

Inverted Pendulum Control Using Negative Data

In the training phase of learning algorithms, it is always important to have a suitable training data set. The presence of outliers, noisy data, and inappropriate data always affects the performance of existing algorithms. The active learning method (ALM) is one of the powerful tools in soft computing inspired by the computation of the human brain. The operation of this algorithm is complete...

MINIMUM TIME SWING UP AND STABILIZATION OF ROTARY INVERTED PENDULUM USING PULSE STEP CONTROL

This paper proposes an approach for the minimum time swing-up of a rotary inverted pendulum. Our rotary inverted pendulum is supported by a pivot arm. The pivot arm rotates in a horizontal plane by means of a servo motor. The opposite end of the arm is instrumented with a joint whose axis is along the radial direction of the motor. A pendulum is suspended at the joint. The task is to design a contro...

Q Learning based Reinforcement Learning Approach to Bipedal Walking Control

Reinforcement learning has been an active research area in recent years, not only in machine learning but also in control engineering, operations research, and robotics. It is a model-free learning control method that can solve Markov decision problems. Q-learning is an incremental dynamic programming procedure that determines the optimal policy in a step-by-step manner. It is an online procedure for...

Reinforcement Learning with Perturbation Method to Turn Unidirectional Linear Response Fuzzy Controller for Inverted Pendulum

In this paper, we present a unidirectional linear response fuzzy controller (FC) to control the inverted pendulum system. The performance of turning fuzzy controller is defined as an evaluation function, and our proposed technique, which is based on the integration of reinforcement learning and a perturbation method, is utilized to diversify the search for the minimum of the evaluation function....

Pareto Optimal Design Of Decoupled Sliding Mode Control Based On A New Multi-Objective Particle Swarm Optimization Algorithm

One of the most important applications of multi-objective optimization is adjusting parameters of practical engineering problems in order to produce a more desirable outcome. In this paper, the decoupled sliding mode control technique (DSMC) is employed to stabilize an inverted pendulum, which is a classic example of inherently unstable systems. Furthermore, a new Multi-Objective Particle Swarm O...


Publication date: 2011